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A  MODEL  OF  CONCEPT  LEARNING 


Richard  H.  Sherman 
ABSTRACT 

Concept  learning  is  investigated  by  constructing  a  computer 
program  named  C£,  which  is  similar  to  EPAM  in  many  respects.  The 
main  feature  of  this  research  is  the  learning  of  hierarchial  con¬ 
cepts,  i.e.,  concepts,  in  terms  of  other  concepts.  In  addition  the 
concepts  can  be  either  conjunctive  or  disjunctive.  CE  can  generalize 
on  the  instances  given  to  it,  and  it  can  adapt,  i.e.,  it  can  correct 
errors  made  in  generalization.  The  learning  ability  of  CE  is  demon¬ 
strated  on  two  example  tasks:  a  concept  learning  task,  and  a  rote 
learning  task.  Computer  experiments  confirm  the  effectiveness  of 
CE  on  these  example  tasks.  Although  not  simulated  on  the  machine, 
we  investigate  a  geometry  analogy  task  which  requires  relational 
concepts . 
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CHAPTER  I 


INTRODUCTION 

Concept  learning  is  an  important  complex  information  process. 

As  such  it  has  received  the  attention  of  scientists  who  study  the 
representation  and  processing  of  information.  Three  such  '-'roups 
of  scientists  are  phi losophers ,  psychologists  and  computer  scien¬ 
tists.  Our  approach  is  a  mixture  of  the  approach  of  psychologists 
and  computer  scientists,  our  objective  is  artificial  intelligence  - 
the  construction  of  computer  porgrams  that  exhibit  intelligence. 

The  purpose  of  Section  2.3  is  to  place  our  approach  into  its1 
proper  perspective. 

Before  we  can  do  this  we  need  a  oefinit-on  of  concept  learning. 
This  is  qiven  in  Sections  2.1  and  2.2.  Our  definition  is  suf¬ 
ficiently  general  to  include  work  in  pattern  recognition.  The 
difference  between  work  in  pattern  recognition  and  concept  learning 
is  mainly  one  of  approach.  Some  workers  ( e . q .  ,  Saner ji  [2]), 
consider  concept  learning  and  pattern  learning  to  be  the  same 
process . 

The  purpose  of  this  thesis  is  twofold.  Tne  first  is  to 
discover  a  bette'"  model  of  concept  learning.  One  important 
feature  of  this  r"ede!  is  the  ability  to  generalize.  Another 
feature  of  this  model  is  tne  ability  to  ’earn  concepts  in  terns  of 
other  concepts. 
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The  second  purpose  is  to  investigate  the  relationship  between 
concept  learning  and  a  rote  learning  theory  called  EPAM,  which  is 
also  a  computer  program.  A  computer  program  that  is  similar  to 
EPAM  in  many  respects  has  been  constructed.  The  proqram  is  called 
Concept-EPAM,  CE,  to  distinguish  it  from  other  versions  of  EPAM. 

Due  to  tht  similarity  of  CE  to  EPAM  we  give  an  explanation  of  EPAM 
’n  Section  2.5. 

The  basic  difference  between  EPAM  and  CE  is  the  introduction 
of  the  set  membership  relat  ^n,  c;  EPAM  only  uses  the  equality 
relation,  =.  The  usefulness  of  this  ne  rels.ion  is  described  in 
Chapter  III  by  exploring  how  concepts  are  stored  in  memory.  This 
extension  of  EPAM  changes  both  the  retrieval  process,  described  in 
Chapter  III,  and  the  learning  process,  described  in  Chapter  IV. 

The  learning  strategies  of  CE  are  two  interacting  pro^jsses, 
image  elaboration  and  tree  modification.  The  hie^archial  structure 
of  CE  is  apparent  in  the  discussion  of  image  elaboration. 

The  implementation  of  CE  is  described  in  Chapter  V.  The 
experiments  '•hat  have  been  given  to  CE  are  described  in  Chapter  VI. 
After  some  basic  information  is  given  to  CE,  two  tasks  are  perrormed 
a  c«...cept  learning  task,  and  a  pair  associate  task.  Although  not 
simulated  on  the  machine,  we  describe  a  geometry  analogy  task  which 
requires  relational  concepts. 

Chapter  VII  contai:  conclusions  and  observations. 


CHAPTER  II 


APPROACHES  TO  CONCEPT  LEARNING 

The  main  purpose  of  this  chapter  is  to  describe  other  research 
in  concept  learning  so  as  to  place  this  work  in  its'  proper  per¬ 
spective.  To  accomplish  this  we  irst  give  some  definitions. 

2 . 1  The  Definition  of  the  Task 

A  concept  is  a  set  of  objects  that  are  members  of  some  universe. 
A  description  of  a  concept  is  an  expression  in  some  language  which 
is  satisfied  ov  every  member  in  the  concept.  Every  concept  has  a 
description,  however,  the  "size"  of  the  expression  may  be  quite 
large.  Concept  learning  is  the  process  of  generating  a  description 
of  a  concept  given  some  objects  that,  are  in  the  concept  and  other 
objects  that  are  i.ot  in  the  concept. 

The  procedure  used  to  investigate  concept  learning  is  the 
same  one  that  is  used  in  a  psychological  experiment.  The  subject, 

CE  in  our  case,  is  presented  with  exemplars*  and  is  told  whether 
or  not  they  are  members  of  a  concept.  After  acquiring  a  descrip¬ 
tion  of  the  concept  the  subject  is  presented  with  other  exemplars, 
some  of  which  he  has  never  seen,  and  asked  if  they  are  in  the  con¬ 
cept.  This  is  basically  the  same  task  used  by  Hunt  [8]. 

*  Exemplars  are  "typical"  members  nf  concepts.  This  term¬ 
inology  is  from  Bruner  [4];  for  the  most  part  we  use  his  term¬ 
inology. 
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Jnformation  is  presented  to  CE  in  the  form  of  a  triplet. 

<exemplar>  <name>  <sign>,  For  example*  the  input  ABC  Cl  -r, 
means  that  the  three  letter  word  ABC  is  an  exemplar  of  the  con- 

] 

cept  Cl  and  CDE  Cl  -  means  that  CDE  is  not  a  member  of  concept 
Cl. 

Information  is  retrieved  from  CE  by  inputing  the  triplet, 

<exemplar-  name'  ?.  An  output  of  or  -  occurs  depending  on 
whether  the  object  (exemplar-name  pair)  -is  found  to  be  true  or 
false.  The  question,  -exemplar-  ?  ♦  ,  wi i i  retrieve  the  name  if 
the  exemplar  is  contained  in  only  one  concept.  For  example,  we 
would  like  ABC  ?  +  to  retrieve  the  name  C2.  However,  most  exemplars 
are  in  several  different  concepts,  as  we  will  see.  Examples  of 
the  retrieval  process  are  presented  in  Chapter  VI. 

General ization  is  the  ability  to  predict  whether  an  object 
never  seen  before  is  a  member  of  a  concept.  Since  only  a  few 
members  of  a  concept  are  given,  general  1 z’tion  is  necessary  to 
learn  the  concept.  Generalization  from  a  list  of  given  elements 
cannot  be  perfect.  A  contradiction  to  the  general ization  can 
appear  at  any  time  when  new  lists  of  elements  are  exhibited.  The 
predictive  ability  of  a  concept  learner  is  "good"  if  correct 
identification  occurs  most  of  the  time.  A  concept  learner  must  be 
able  to  correct  mistaken  generalization  when  necessary.  We  are 
not  concerned  with  avoiding  mistakes  a!toget'°r  but  with  "good" 
generalization  and  with  the  adaptive  abilities  of  the  concept 


learner. 
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2 . 2  Classes  of  Concepts 

Several  types  of  concepts  are  now  defined.  The  simplest  case 
is  the  conjunctive  concept.  A  conjunctive  concept  is  represented 
by  the  conjunction  of  the  features  possessed  by  all  of  the  exemplars 
in  the  concept.  For  example,  the  concept,  Cl_,  of  those  three 
letter  words  whose  first  letter  is  A_  and  whose  second  letter  is  B. 
is  a  conjunctive  concept,  We  can  denote  it  by  AB_  Cl.  The  dash 
■odicates  that  the  third  letter  is  irrelevant.  ABC  and  ABD  are 
both  exemp 1 ars  of  Cl . 

A  disjunctive  concept  is  the  union  of  several  conjunctive  con¬ 
cepts.  For  example,  the  concept,  C2,  of  those  th~ee  letter  words 
whose  first  letter  is  A  or  whose  first  letter  is  B,  is  a  dis¬ 
junctive  concept.  This  concept  can  be  represented  by  the  dis- 

„  Jr  ction  of  two  conjunctive  concepts,  (A  B _ )  C2.  The  dash 

again  indicates  irrelevant  letter  positions  and  the  wedge 
indicates  logical  "or." 

A  hierarchial  concept  defines  sets  of  objects  whose  parts  are 
also  concepts.  Let  us  consider  the  concept  'A'  C3,  where  'A' 

stands  for  any  written  letter  a,  and  C3  is  the  concepts'  name. 

ABC,  abc,  aba  ,  etc.,  are  all  members  of  C3  because  the  first 
letter  of  each  is  an  a  If  the  disjunctive,  'A',  of  all  written 
a's  is  learned,  C3  becomes  just  a  simple  conjunctive  concept, 

'A'  This  illustrates  he  advantage  (or  necessity)  of  describ¬ 

ing  one  concept  in  terms  of  mother. 

A  relational  concept  is  a  .  et  of  objects,  each  object  has  two 


(or  more)  subparts  that  satisfy  some  relation.  For  example,  the 
following  objects  have  subparts  which  are  related  by  "next1;  ACB, 
BEC,  and  FAG.  This  relational  concept  can  be  described  as  "the 
third  letter  is  'next'  after  the  first  letter."  Geometry  analogy 
problems  are  another  example  of  objects  whose  subparts  are  related. 
These  problems  are  discussed  in  Chapter  VI. 
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2 , 3  Related  Work  in  Concept  Learning 

Concept  learning  is  a  fundamental  process  according  to 
philosophers.  For  example,  Quine  [12]  considers  concepts  as  the 
basis  for  understanding  language.  Quine  [12]  states  "Conceptual¬ 
ization  on  any  considerable  scale  is  inseparable  from  language, 
and  our  ordinary  language  of  physical  things  is  about  basic 
as  language  gets."^  Quine  [12]  examines  the  notion  of  meaning  and 
objective  reference.  However,  he  is  more  concerned  with  the 
structure  of  language  then  the  process  of  acquiring  concepts. 

The  most  popular  approach,  typified  by  Sebestyen  [13],  is 
the  evaluation  of  discriminant  functions.  Statistical  techniques 
are  often  used  to  determine  coefficients  of  the  discriminant 
function.  Although  the  number  of  coefficients  that  need  to  be 
determined  are  large,  some  good  techniques  have  been  developed 
for  determining  them.  This  approach  is  called  "classification 
theory.1'  A  more  powerful  method  called  feature  extraction  is 
desireable.  Feature  extraction  is  a  process  that  generates 
"good  features."  Feature  extraction  is  a  difficult  problem  and 
few  pattern  recognizers  incorporate  it.  rather  they  are 
dependent  on  the  initial  measurements  (i.e.,  basic  features). 

The  purpose  of  feature  extraction  is  to  reduce  the  number  of  terms 
in  the  discriminant  function.  The  only  work  known  to  us  which 
incorporates  feature  extraction  is  Uhr  and  Vossler  [16].  Their 

^Willard  Quine,  ^ o r d^ and [  Obj_ec t  (Cambridge  Press,  1960)  p.3. 
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features  are  templets,  and  a  few  pertinent  templets  are  selected 
from  a  set  of  75  templets. 

A  classical  study  in  concept  learning  is  by  Bruner,  Goodnow, 
and  Austin  [4].  They  define  concepts  and  deal  mainly  with  con¬ 
junctive  concepts,  but  they  do  discuss  disjunctive  concepts. 

Several  strategies  for  learning  concepts  are  proposed.  The 
strategy  for  learning  disjunctive  concepts  only  allows  for  objects 
whose  subparts  have  binary  values.  Expei iments  on  human  subjects, 
similar  to  the  type  performed  by  CE,  are  used  to  test  the  theories 
of  learning.  Bruner's  subjects  learn  concepts  made  up  of  basic 
concepts  well  known  to  humans,  e.g.,  number  of  boarders  on  a  card, 
shapes,  etc.  Briber  [4J  does  not  investigate  how  humans  learn 
these  basic  concepts  from  sense  data.  Bruner's  strategies  are 
not  used  in  CE,  at  least  not  in  an  obvious  way. 

The  work  of  Hunt  [8]  relates  a  computer  model  of  concept 
learning  with  research  on  concept  learning  in  humans.  He  notes 

that  concept  learning  "...  is  a  hierarchial  affair.  One  must  have 

2 

a  concept  in  order  to  learn  more  concepts."  Hunt  does  not  deal 
with  learning  "concepts  of  concepts."  The  issue  of  hierarchial 
concepts  is  primarily  how  this  research  differs  from  Hunt’s.  Con¬ 
cepts  of  concepts  are  tuilt  into  the  basic  structure  of  CE. 

Banerji  [1]  also  noticed  the  useful  ess  of  concepts  concepts. 
Banerji’s  work  is  concerned  with  expressing  objects  and  concepts  in 

- - - 

Hunt,  Marin,  Stone,  Experiments  m  Induction,  (Academic  Press, 
N.Y.  and  London,  1966).  p.  1 2. . 
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a  formal  description  language.  Essentially  our  description  of 
objects  do  not  differ  from  his,  both  languages  contain  the  re¬ 
lations,  set  membership,  c,  and  equality,  =.  However,  Banerji's 
language  is  more  general,  in  particular,  the  language  described 
in  Banerji  [1]  contains  quantification  of  variables.  A  concept 
learning  algorithm  for  one  of  Banerji's  languages  was  proposed 
by  Windeknecht  [17].  The  algorithm  was  successful;  however,  it 
applies  only  to  a  restricted  class  of  concepts. 

Our  approach  combines  Banerji's  description  language  and 
EPAM.  The  language  of  CE  is  similar  to  Banerji's  and  the 
structure  of  CE  is  similar  to  EPAM.  We  describe  EPAM  next  and 
following  that  (Chapter  III)  we  describe  CE. 


2.4  Rote  Learning  and  Concept  Learning 

Rote  learning  in  humans  is  studied  in  an  experiment  called 
pair-associate  leaning,  In  this  experiment  the  subject  is 
presented  with  a  list  of  paired  terms.  The  first  term  of  the 
pair  is  called  the  stimulus  and  the  second  term  is  called  the 
response.  After  being  presented  with  the  list,  the  subject  is 
shown  a  random  stimulus  and  is  asked  to  reply  with  the  appropriate 
response.  After  his  reply  the  correct  response  is  given.  The 
list  is  then  recycled  through  until  the  subject  answers  correctly 
to  each  stimulus  on  the  list. 

The  basic  problem  facing  the  subject  is  to  learn  to  make  a 
discrimination  between  different  items  on  the  list.  The  phenomenon 
called  stimulus  general ization  occurs  >n  which  the  subject  reacts 
with  the  same  response  to  different  stimuli.  Thus  after  learning 
a  response  to  a  particular  stimulus,  a  whole  class  of  stimuli  will 
elicit  the  sam*  response. 

The  idea  of  stimulus  general ization  forms  a  connection  between 
experiments  in  rote  learning  (in  particular  EPAM  since  it  exhibits 
this  phenomenon)  and  experiments  in  concept  learning.  In  the  pair- 
associate  task  a  response  to  one  stimulus  may  be  elicited  by  other 
stimulus  which  have  not  appeared  on  the  original  training  list. 
Power  and  Trabasco  [3]  nave  suggested  an  experimental  continuum 
from  the  pair-associate  task,  in  which  each  stimulus  is  paired  with 
a  unique  response,  to  a  concept  learning  task,  where  severe 
stimulus  are  paired  with  the  same  response. 


The  process  of  general ization  is  not  as  simple  as  we  seem 
to  indicate.  There  are  influences  which  restrict  the  range  of 
general i zation  and  restrain  the  subject  from  making  the  same 
response  to  all  physically  similar  stimulus.  The  manner  in 
which  humans  draw  boundaries  between  members  of  a  class  and  non- 
members  of  a  class  is  not  easily  understood.  The  relation  be¬ 
tween  general  ization  which  occur  in  humans  in  performing  the 
pair-associate  and  concept  learn’ng  tasks  is  not  obvious.  Hunt 
states : 

The  fact  that  list  length  nas  different  effects  in  the 
concept  learning  and  the  pa  1 r-assoc ' ates  task,  taken 
together  with  the  fact  tnat  general ization  between 
stimuli  appears  to  play  a  different  part  in  each  situa¬ 
tion,  suggests  tnat  different  psychological  processes 
are  involved  and  tnat  one  simulation  ought  not  to  be 
expected  to  cover  both  fields.  3 

CE  uses  tne  same  processes  for  both  tasks. 
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2.5  EPAM 

EPAM  is  capable  of  learning  (memorizing)  objects.  The  objects 
used  in  CE  are  the  same  as  those  used  in  EPAM.  Objects  are  either 
simple  or  compound.  A  simple  object  is  a  list  of  pairs,  the  first 
member  is  an  attribute;  the  second  member  is  either  1^  or  0.  All 
simple  objects  have  the  same  attribute.  A  compound  object  is  a 
list  of  pairs,  the  first  member  'S  an  attribute;  the  second  member 
is  an  object,  either  simple  or  compound.  The  universe  of  objects 
contains  all  the  objects  possible  for  given  attributes. 

An  example  of  a  simple  object  i.  the  letter.  A,  which  can  be 
encoded  as, 

((f1  0)  (f2  1)  ...}. 

The  f.'s  are  the  attributes  of  simple  objects.  An  example  of  a 
compound  object  is  the  nonsense  syllable,  TAT  which  is  encoded 
as , 

((1st  X)  ( 2nd  A 1  (3rd  T)). 

Here  1st,  2°d  and  3rd  are  attributes  paired  with  >he  simple  objects 
X.  A,  T,  respectively. 

Before  we  explain  EPAM  learns  these  objects,  it  is  helpful 
to  discuss  the  type  of  information  processing  mode1  wit-  which  we 
are  concerned.  The  me-,  >s  shown  in  Figure  1.  e  mfor~a?  ion 
processor  receives  input  information  frpm  the  universe  and  stores 
i nfc  "mat ion  about  them  in  an  internal  representation.  Irfom-at i  on 
about  the  universe  of  objects  can  then  be  retrieve:  tv  asi'nq 
guest  -ons  of  t^e  system.  Feedtac*  *r._  -  t-e  verse  is  e~plc*ed  N 
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analyzing  responses  to  these  questions  m  relation  to  the  correct 

responses.  Internal  changes  are  then  made  to  decrease  the  number 

of  responses  in  error.  The  changes  that  occur  in  the  internal  \ 

structure  are  called  the  processes  of  learning.  The  types  of 

changes  that  occur  for  any  particular  situation  depend  on  the 

strategies  for  learning. 

The  mput  is  encoded  and  then  sent  to  the  memory.  The  out¬ 
put  is  received  from  memory  and  transmitted  to  the  environment. 

The  objects  described  above  are  outputs  of  the  encoder.  This 
thesis  is  mainly  conrerned  with  the  memory  and  the  storing  and 
retrieving  algorithms  of  the  concept  learning  system. 

The  CE  model  of  learning  has  basically  the  same  structure 
as  does  EPAK.  The  organization  and  processes  of  CE  is  different 
and  is  presented  m  Chapter  III. 

The  structure  of  EPAM  is  a  binary  decision  tree.  The  non¬ 
terminal  nodes  contain  tests.  When  an  object  enters  the  binary 
tree  it  follows  a  path  down  a  terminal  node.  Its'  path 
through  the  tree  depends  on  the  result  of  the  node  tests.  If  an 
object  passes  a  test,  the  branch  is  taken,  if  the  test  fails, 
the  branch  is  followed.  At  the  terminal  node  is  stored  an 
internal  image  of  an  object  that  comprises  all  the  information 
memorized  by  the  learner  about  that  object.  Associated  with  the 
partial  image  is  a  cue  token.  When  an  object  reaches  a  terminal 
node.,  the  cue  token  stored  at  that  node  evokes  a  new  image  from 
the  discrimination  net  in  response.  This  process  of  evoking  a 
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response  is  called  the  association  process. 

The  discrimination  learning  process  is  made  up  of  two  inter¬ 
acting  mechanisms  called  image  building  and  tree  modification. 

The  image  building  mechanism  elaborates  the  internal  image  stored 
at  the  terminal  node.  The  tree  modification  mechanism  grows  the 
discrimination  net  by  constructing  a  subnet  of  tests  based  on 
differences  round  and  appending  the  new  subnet  at  the  terminal  node. 

An  example  of  the  learning  process  in  EPAM  on  a  pair-associate 
experiment  is  now  shown. 

Let  us  assume  that  EPAM  already  has  been  presented  with  a 
list  of  pair-associate  non-sense  syllables.  Figure  2  shows  a 
snapshot  picture  of  a  typical  EPAM  discrimination  net.  The 
terminal  node  is  _A'_  where  the  blanks  indicate  that  no  information 
is  stored  about  the  first  and  third  letter  positions,  and  the 
prime  means  that  only  a  partial  image  of  an  A  is  stored.  The  cue 
token  at  the  terminal  node  is  R'E'P'.  This  cue  token  points  to 
the  response  REP  also  stored  in  the  tree  but  not  shown  in  Figure  2. 

Suppose  the  following  two  pair-associate  nonsense  syllables 
are  presented  to  the  net: 

Stimulus  Response 

XAZ  REP 

XAQ  ZOX 

The  first  stimulus  object  XAZ  is  sorted  to  the  terminal  node  shown 
in  Figure  1.  The  cue  token  now  evokes  the  stored  response,  and 
REP  is  outputed.  The  output  is  compared  with  the  paired  response 

| 

1 
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and  found  to  be  correct.  The  partial  unage  'S  now  elaborated  by 
storing  new  information,  ’t  becomes  X‘A‘_. 

Next  the  input  XAQ  sorts  to  the  same  terminal  node,  the  re¬ 
sponse  RtP  is  again  given.  This  is  a  demonstration  of  stimulus 
generalization  since  d''rrerent  st’muli  produce  the  same  response. 
However,  when  the  output  is  compared  with  the  paired  term,  ZOX,  an 
error  is  registered.  To  correct  the  error,  discrimination  between 
the  object  XAQ  and  the  partial  image  occurs  by  modifying  the  net. 

A  test  on  some  dilference  is  added  to  the  EPAM  tree  and  a  new  image 
is  formed.  The  new  tree  is  shown  in  Figure  3.  Note  that  the  re¬ 
sponse  image  Z"0"X"  is  also  stored  m  the  discrimination  net  and 
a  cue,  Z * 0 ‘ x 1 ,  is  stored  with  the  object. 


Figure  3 

EPAM  Tree  After  Learning  Additional  Stimuli 
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ORGANIZATION  OF  CE'S  MEMORY 

EPAM  is  capable  of  storing  or  memorizing  discrete  objects 
Doth  "simple"  and  "compound.  In  CE  only  classes  of  objects  are 
stored.  Each  class  of  objects  represents  a  concept.  A  terminal 
node  represents  the  conjunction  of  all  objects  that  can  be  sorted 
to  the  node  and  match  the  image  at  the  node.  Thus,  conjunctive 
concepts  are  similar  to  partial  images  in  EPAM.  Since  a  con¬ 
junctive  concept  is  located  at  one  terminal  node,  many  different 
objects  will  sort  to  the  same  terminal  node.  This  implies  that 
the  strategy  for  learning  has  to  be  different  than  that  used  by 
EPAM,  because  we  do  not  want  to  discriminate  between  the  objects 
of  a  single  concept.  The  learning  strategy  is  discussed  in  the 
next  chapter. 

n  disjunctive  concept  is  represented  in  the  CE  discrimination 
net  by  letting  two  or  more  terminal  nodes  have  the  same  name.  For 
example,  the  disjunctive  concept  C2,  described  by  the  two  con¬ 
junctive  concepts  A  C 2  and  B  C2  is  represented  at  two  terminal 
nodes  in  Figure  4.  8oth  the  name  (C2)  and  a  description  of  a  con¬ 
junctive  concept  (e.g.,  A  )  are  stored  at  a  terminal  node.  If 
we  ask  the  question,  ACD  C2  ?  (i.e.,  ' s  ACD  contained  in  the  con¬ 
cept,  C2),  the  object.  ACD  C2,  will  be  sorted  to  the  left  node 
of  the  CE  tree  shown  in  Figure  4.  Since  the  object  matches  the 
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image,  the  answer  to  the  question  js  ♦  .  The  object  BCD  C2  would 
sort  to  the  right  terminal  node  in  Figure  4,  and  again  the  answer 
*s  +.  Thus,  C2  is  the  disjunctive  concept,  A _ _ . 

The  hierarchial  concept  'A' _  C3,  where  'A'  stands  for  any 

written  letter  a^,  is  a  simple  description  of  the  disjunctive  con¬ 
cept  A _ ^a _ va _ .  Naturally,  we  want  C3  to  occupy  a  single 

terminal  node.  Therefore,  some  non-terminal  node  must  be  the  test 
"is  the  exemplar's  first  letter  a  written  a_"  (i.e.,  (ex  1st)  e  'A'). 
If  the  encoding  of  A  and  a^  are  sufficiently  different,  the  test 
(ex  1st)  =  a  would  turn  C3  into  a  disjunctive  concept  as  shown  in 
Figure  5.  The  object,  abc  C3,  would  fail  the  test  (ex  1st)  =  A 
while,  ABC  C3,  would  pass  the  test.  Hence,  the  requirement  that 
C3  (and  other  concepts  of  the  same  form)  occupy  a  single  terminal 
node  implies  that  a  non-terminal  node  may  be  a  set  membership  test 
instead  of  an  equality  test.  The  generalization  to  set  membership 
tests  is  quite  complex.  For  example,  to  answer  the  test 
(ex  1st)  e  ' B *  we  might  "throw"  the  first  letter,  say  b,  of  an 
exemplar,  back  into  the  tree.  But  b  may  be  in  many  different  con¬ 
cepts,  e.g.,  lower  case  letter,  vowel,  etc.  We  must  assume  that 
concepts  "overlap"  with  each  other  in  a  complex  way  (i.e.,  the 
intersection  of  two  concepts  is  not  necessarily  empty).  However, 
we  can  answer,  a  e  'A'  because  it  is  either  true  or  false.  This 
is  not  a  paradox;  it  implies  that  there  must  be  tests  on  names. 
Thus,  objects  are  stored  in  the  tree  with  the  possibility  of  tests 


on  names. 
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Due  to  the  importance  of  tests,  we  define  the  class  of  tests 
that  can  be  used  at  non-terminal  nodes  of  a  CE  tree.  A  test  has 
the  form,  <feature>  <relation>  <image>.  The  relation  is  either  set 
membership  or  equality.  The  feature  is  an  ordered  list  of  attri¬ 
butes  that  designates  a  subpart  of  an  object.  The  image  is  the 
information  stored  at  some  terminal  node  when  the  feature  does  not 
designate  part  of  a  "simple"  object  (otherwise,  the  image  is 
either  '1'  or  'O'  and  the  relation  must  be  equality). 

To  explain  how  tests  are  performed  on  objects,  we  briefly  give 
an  example  of  a  compound  object  in  CE.  The  compound  object, 

ABC  C2,  is  encoded  as  a  pair: 

(1)  (ex  ((1st  A)  (2nd  B)  (3rd  C))) 
and 

(2)  (na  ((1st  C)  (2nd  2))) 

The  attribute  ex  denotes  that  the  second  member  of  (1)  is  an 
exemplar;  na  denotes  that  the  second  member  of  (2)  is  a  name. 

The  second  member  of  (1)  is  also  a  compound  object.  A,B,C,2 
are  simple  objects  and  may  be  encoded,  e.g.,  as 

(3)  A  -  ((f,  1)  (f2  0)  (f3  1)  ...) 

f 1 ,  fg,  f^,  ...  denote  elementary  attributes  of  letters,  i.e., 
number  of  straight  line  segments,  arcs  of  circles,  etc.,  and 
0,  1  are  values. 

A  feature  of  a  test  designates  a  subpart  of  an  object.  The 
first  attribute  cf  a  feature  designates  an  immediate  subpart,  X, 
of  the  object.  The  second  attribute  of  the  feature  designates 
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an  immediate  subpart  of  X,  etc.,  and  the  last  attribute 
designates  the  subpart  for  wh’ch  the  test  is  applicable.  Suppose 
that  the  test,  ((ex  1st)  £  'A')  is  applied  to  ABC  C2  as  encoded 
above,  ex  designates  the  second  member  of  the  pair  (1)  and  1st 
designates  the  A  in  (1).  Thus,  in  this  case,  the  test  becomes 
A  t  'A'  whose  truth  value  depends  upon  what  has  been  learned, 
previously. 

Now  we  can  state  algorithm  1,  used  for  the  evaluation  of 
the  set  membership  test.  (Equality  tests  are  similar  to  the 
EPAM  tests . } 

1.  Find  the  subpart  designated  by  the  feature  of  the  test. 

2.  Form  the  question  -subpart-  -image  ?  and  answer  it  as 
if  it  were  an  input  question. 

As  an  example  of  algorithm  1  suppose  the  input  ABC  C 2  +  sorts 
to  the  non-terminal  test  node  (ex  1st)  •  'A1.  The  question  A  ’A' 
is  thrown  back  into  the  tree.  If  a  terminal  node  is  found  that 
matches  the  object  A  'A:,  the  test  (ex  1st)  .  'A  is  true,  other¬ 
wise  false. 

The  images  at  terminal  nodes  are  no*  explained.  An  image 
is  the  conjunction  of  several  tests.  Each  test  mas  the  same  form 
as  a  test  at  a  non -terminal  node.  However,  in  addition  to  -  -and 
.  .  the  relation  can  be  t  or  /'  (the  negation  a f  -■  and  =  ,  respect¬ 
ively).  The  usefulness  cf  these  relations  is  discusser  in  the 
next  chapter.  An  example  cf  an  image  is  the  conjunction  of  the 
fo’ lowing  tests : 
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i  ((ex  1st)  ,  A'), 

((ex  2nd)  e  B'), 
v(na)  * (C3) '  )  >. 

The  object  ACD  C3  matches  this  image  because  the  statements  AeA', 
CcB',  and  C3  -  (C3)'  are  all  true.  The  prime  on  the  letter  A  in 
the  first  test  indicates  that  A*  is  an  image  of  A.  Thus,  A'  is  not 
the  same  as  the  externally  encoded  A,  but  only  some  partial  image 
of  A.  Note  an  object  is  said  to  match  an  image  when  all  tests  from 
the  image  on  the  object  are  true.  An  object  is  different  from  an 
image  if  any  test  of  the  image  is  false  on  the  object. 

Names  of  concepts  as  well  as  exemplars  are  represented  as 
concepts  in  CE.  The  name  C3  is  different  every  time  it  is  written, 
i.e.,  C3  c3,  c3,  are  all  slightly  different.  Thus,  the  name  is 
really  a  word  whose  first  letter  is  a  C  (maybe  capital,  Roman) 
and  second  letter  is  a  3.  Due  to  the  special  role  of  names  we 
assume  that  names  can  be  learned  as  disjoint  conjunctive  concepts. 

Names  are  also  le  rned  in  the  tree.  When  an  input  sorts  to 
a  new  terminal  node  the  name  is  thrown  back  into  the  tree  and 
learned  as  a  concept,  unless  the  name  is  an  atomic  symbol  used 
for  output.  This  atomic  symbol  is  called  a  print  name  and  its'  use 
is  shown  m  Section  6. 

An  example  of  learning  a  name  is  now  given.  When  the  object 
ACD  C3  reaches  a  new  terminal  node  (a  node  with  no  image)  the 
name  C3  is  inputed  into  the  tree  as  the  list  (OBJ.  (1st  C)(2n^  3)) 
where  OBJ.  indicates  that  the  encoding  is  a  single  object;  C  and  3 


are  toe  simple  objects  that  are  subparts  of  an  input  object.  The 
test  ((na)  =  ( C3 ) ' )  is  then  formed  at  the  new  terminal  node.  The 
image  (C3)'  is  identical  with  the  image  of  the  name  C3  stored  at 
an  individual  node. 

To  execute  the  test  ((na)  =  (C3)')  on  an  object,  the  feature 
na  is  found  on  the  object.  Then  the  object  subpart  remaining  is 
matched  with  (C3)‘.  If  the  object  subpart  matches  the  image,  then 
tne  test  has  passed,  otherwise  the  test  has  failed. 

If  the  input  is  an  image  (such  cases  arise  from  algorithm  1) 
the  subpart  na  on  the  input  is  itself  an  image.  The  na  image  on 
the  i np.- 1  cannot  be  compared  with  the  test  image  since  descriptions 
cannot  be  compared.  However,  we  can  require  all  tests  in  the 
test  image,  say  (C3)',  to  be  possessed  Dy  the  subpart  na  image  on 
the  input  for  the  test  to  be  passed,  otherwise  the  test  has  failed. 
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L lAkN  1 NG  SIRAT£ijf 

As  in  EPA.m  learning  consists  of  two  processes:  1)  Con¬ 
structing  and  elaborating  images  at  terminal  nodts  of  the  net, 
and  2)  Grow  ng  the  net. 

F-rst  we  d'bi.uss  me  image  elaboration  process ,  then  we 
describe  how  the  Cc  tree  is  modified  by  learning. 

4 . 1  Image  E  < a c>  a t  mn 

The  lea'ning  strategy  applies  to  objects  of  any  complexity. 
(Complexity  s  the  depth  of  nesfng  of  compound  objects.;  If  .he 
subparts  of  a  compound  object  are  'earned  m  the  CE  tree  as  a 
member  of  a  concept,  then  the  compound  object  should  be  easier 
to  learn.  Learn  no  must  then  ’nvoive  throwing  subparts  of 
objects  back  into  the  tree  to  retrieve  ! nf orr ... t 'on  (Image 
elaboration  of  s  mple  objects  does  not  involve  .('rowing  subparts 
back.;  Suppose,  for  example,  the  input  BBC  C3  »  's  sorted  to  a 
new  terminal  node  and  we  want  to  form  an  initial  focus*  image. 

The  exemplars  immediate  subparts  a»e  thrown  into  the  tree  to  re¬ 
trieve  information,  i.e.,  the  question,  B  ?  ♦,  is  asked.  Since 
this  is  a  question  no  learning  of  the  object  will  take  place. 

Tins  usage  of  'focus'  >s  tne  same  as  in  Bruner  [4], 
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There  are  three  possibilities  in  retrieving  information. 

1.  The  subpart,  8,  is  a  member  of  only  one  concept,  ' B ' . 

Tnen  the  name  '8'  is  retrieved. 

2.  In  general  we  have  "overlapping"  concepts  and  B  will 
be  a  member  of  many  concepts.  Thus,  tests  on  names 
will  be  encountered  in  sorting,  B  ?  +,  to  a  terminal 
node.  When  a  test  on  the  name  is  encountered,  a 
search  of  the  tree  (i.e.,  an  investigation  of  both 
branches  leading  from  the  test  on  the  name)  is  used 
to  find  some  names.  If  a  terminal  node  is  found  such 
that  B  matches  the  image  of  the  exemplar,  then  the 
image  of  the  name  is  retrieved. 

3.  If  the  subpart,  B,  is  not  in  any  concept,  we  throw 
the  subparts  of  B  back  to  retrieve  information  about 
B  itself.  In  this  case  B  is  a  simple  object  and  no 
subparts  can  be  thrown  back  into  the  net.  Chapter 
VI  will  give  an  example  of  this  process  on  more 
complex  exemplars. 

To  summarize,  image  elaboration  of  an  object  X,  involves 
searching  the  CE  tree  for  those  concepts  of  which  some  subpart 
of  _K  's  a  member.  If  no  immediate  subpart  of  X  is  a  member  of  a 
previously  learned  concept  then  the  subparts  are  considered. 

Search  arises  because  some  object  may  be  in  many  different  concepts. 
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4 . 2  Modifications  of  the  C E  Tree  Due  to  Learning 

Learning  only  occurs  when  information  is  presented  to  CE; 

not  when  questions  are  asked.  !n  CE  the  first  image  at  a 

terminal  node  serves  as  a  focus  -mage  for  concept  learning. 

When  additional  objects  are  sorted  to  a  terminal  node  image 

adjustments  are  made  or  the  net  is  g-own  by  adding  a  new  test 

and  a  new  term'nal  node  To  illustrate  the  strategy  for  net 

modification,  several  cases  will  be  examined.  The  different 

cases  depend  on  the  three  variab'es:  1)  the  sign  of  the  Input 

(+  or  -),  2)  the  result  of  comparing  the  exemplar  of  the  input 

with  the  tests  at  a  terminal  node,  3)  the  result  of  matching  the 

name  ot  the  input  and  the  image  of  the  name^it  the  terminal  node. 

The  cases  afe  summarized  in  Table  1;  we  give  examples  below. 

Case  1 :  The  input,  ABC  C3  ♦,  is  sorted  to  a  terminal  node 

whose  image  is  .((ex  1st)  ►  A'),  ((na)  ~  ^ C3 ) ' ) > .  In  this  case, 

no  modification  occurs  because  no  difference  is  detected.  Both 

tests  on  the  exemplar,  ABC,  and  on  the  name,  C3,  are  true.  Thus, 

the  input  has  already  been  "learned." 

Case  2:  The  input,  ABC  C3  ♦,  is  sorted  to  a  terminal  node 

whose  image  is  ((e<  1st)  D'),  ((na)  (C3)‘>.  In  this  case. 

a  difference  is  detected  so  we  mark  the  test  irrelevant,  which 

s  t 

we  denote  by  placing  an  X  next  to  the  test,  ’.e,,  ((ex  1  )  e  D'  X). 

If  the  input,  ASC  C3  ♦,  ’S  sorted  to  the  terminal  node  whose 
image  is  ((ex  ist)  A'),  ((na)  =  (C3)'),  then  mark  the  test 
((ex  1st)  *  A1)  irrelevant  smce  ’t  s  raise  on  the  input  exemplar. 
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Case 

Number 

Name 

Match 

Exempla,  Match 
(tests  on  exemplar) 

Input 

Sign 

Action 

1 

matches 

true 

+ 

do  nothing 

2 

matches 

fal  se 

+ 

i 

i 

i 

make  test  X 
irrelevant 
or  grow  tree 

3 

matches 

all  tests  true 

1 

1 

| 

make  all  test' 
Y 

irrelevant  or 
erase  Y 
irrelevant  on 
false  test 

4 

matches 

false 

- 

do  nothing 

5 

different 

al 1  tests  true 

+ 

grow  tree 
on  name 

test 

6 

different 

false 

♦ 

grow  tree 
on  one 
difference 

7&8 

different  : 

true  or  false 

; 

do  nothing 

Table  1 

A  Summary  of  Tree  Modifications 
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Case  _3:  rhe  ’nput,  ABC  C3 
.  „nd ,  , ,  , 

whose  image  ’s  ( vex  c  j  ■  i> 
All  ex emp i a r  tests  are  true  on 

irrelevant  by  p!acmg  a  »  next 

st  „nd 

• ( (ex  >  )  A  '  ? ) ,  Hex  2  )  • 


•,  so*  ted  to  a  terminal  node 
,  ( iex  Is  t  '1  •  A’ ) ,  una;  "  (C3) ' ) ; . 
the  nput  so  we  mark  all  the  tests 
to  the  tests.  The  new  node  is  now 
B •  r ) .  ( 1 na  }  (C3> ' )  The  reason 


tor  using  a  d’T+e*ert  ’-relevance  sign  m  Case  3  from  the  sign 


used  ■ n  Case  2  s  explained  after  the  samp-e  cases. 

;  +  the  nput  ABC  C3  *•  is  sorted  to  a  termma1  node  whose  image 


s 

-(e*  lST)  A1  uex  2nd)  D'  y).  v(na)  •  tC3/ )  • 

n  d  i 

then  the  act'on  ’S  to  e-ase  the  Y  sign  on  the  test  (lex  2  )  .  D  ). 
In  r.h’s  example  all  r  empiar  tests  are  true,  however,  there  is  a 
test  marked  i  irre'evant  wh  ch  is  faise  on  the  input. 

Case  4:  The  ’nput,  ABC  C3  - ,  s  sorted  to  a  terminal  node 


whose  image  ;s 

■tie 

St 

r  ■  / 

D*  i 

( 

<na)  { C 3 ) 1 )  .  The  test  on 

t  he 

e > emp ’ 

’  a  * 

s  ♦V 

ce  and 

we  do 

noth-ng  -n  this  vase 

Case 

5: 

T  he 

nput  > 

s  ABC 

C3 

-  and  the  terminal  node  is 

(  U 

,st 

:  x  1 

j  A 

/ ,  t 

(  na  ; 

( C  4  i ' 

• 

i-nce  all  tests  on  the  exemplar 

at  e 

t»'ue . 

f  he 

t-  ee 

*  lj»  ' i*  u 

wn  by 

us ' 

nq  the  test  on  the  name. 

i  1  n. 

!  i  1 1 

Cast- 

4  1  1  ; 

6: 

T  ht?  ' 

nput  1 

\  A  nr 

>  ^  v« 

13 

and  the  image  at  the  terminal 

node  •>  ;-e»  !U,  !)').  tin  a;  (14/  )  The  test  on  both  the 

exemplar  ami  the  mime  are  fa'se,  and  the  tree  is  grown  by  usinq  a 
test  on  the  e>emp  a*  ,  '  e  ,  ,(e»  1  )  l1  ;  • 
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Case  7&8:  A  negative  input,  ABC  C3  is  sorted  to  a  terminal 
node  and  the  name  of  the  input  does  not  match  the  name  at  the 
nude.  Nothing  is  done  to  the  node  in  these  cases. 

When  all  tests  at  a  node  are  marked  irrelevant,  the  node  is 
considered  blank  and  image  elaboration  occurs.  Thus,  Table  1  only 
shows  cases  of  tree  modification. 

One  additional  case  of  tree  modification  is  not  shown  in 
Table  1.  If  the  input  arrives  at  a  terminal  node  where  ail  tests 
are  marked  irrelevant  and  in  addition  no  new  tests  are  found  by 
means  of  image  elaboration,  then  CE  assumes  the  concept  to  be  a 
disjunctive  concept.  For  example,  if  the  input,  ABC  C3  +  ,  sorts 
to  the  terminal  node 

{((ex  1st)  £  A'  X),  ((ex  2nd)  t  O'  X),  ((ex  3rd)  e  D'  Y), 

((na)  =  ( C 3 } ' ) ; 

and  no  new  tests  are  found  by  means  of  image  elaboration,  a  new 
non-ten”; , ia 1  tree  test  is  formed  from  one  of  the  irrelevant  tests 
marked  with  an  X.  The  criteria  for  determining  which  test  is  best 
illustrated  by  means  of  an  example  given  in  Chapter  V. 

A  few  descriptive  remarks  about  CE  tree  modification  are 
now  made.  For  conjunctive  concepts,  a  feature  possessed  by  a 
positive  instance  of  a  concept  but  not  possessed  by  another  positive 
instance  of  the  same  concept  is  always  irrelevant  to  tk°  concept. 
This  type  c  irrelevance  occurs  in  Case  2  of  Table  1  and  is  de¬ 
noted  by  the  sign,  X.  Another  kind  of  irrelevance,  deroted  by  the 
sign  Y,  occurs  in  Case  3.  When  a  negative  instance  of  1  concept 


possesses  a  feature  that  a  positive  instance  also  possesses,  then 
that  feature  is  only  possibly  irrelevant  to  the  concept. 

Learning  of  concepts  in  CE  involves  a  process  of  changing  the 
tests  at  the  terminal  node  image,  when  the  imaqe  has  many  tests, 
the  extent  of  the  concept  is  snail  because  only  a  few  input  ex¬ 
emplars  match  the  unaqe.  When  the  ime-  e  has  only  a  few  tests,  the 
extent  of  the  concept  is  la^qe  because  many  input  exemplars  match 
the  image.  If  a  positive  mnut  does  not  match  an  image,  then  the 
extent  of  trie  concept  increased  by  mating  tests  irrelevant.  This 
action  correspond j  to  Case  2  of  Table  1.  if  a  negative  input 
matches  an  ’mage  as  in  Case  3  of  Table  1.  then  the  extent  of  the 
concept  is  decreased  by  mating  previously  irrelevant  tests  or  by 
making  all  tests  irrelevant  thus  forcing  image  elaborat'on  to  occur. 


CHAPTER  V 


THE  IMPLEMENTATION 

The  main  reason  for  programming  CE  ;s  to  evaluate  its  per¬ 
formance  on  actual  tasks.  In  general  ,  wnen  a  few  simple  processes 
(e.g  ,  CE  image  elaboration  and  net  modification  processes)  inter¬ 
act  m  a  complex  way  the  result  may  be  unpredictable.  Another 
purpose  of  coding  the  program  is  to  demonstrate  that  the  theory  of 
concept  learning  presented  m  this  report  is  sufficiently  complete 
and  detailed  to  be  implemented  on  a  digital  computer.  The  computer 
program  provides  a  check  on  errors  in  the  theory  of  CE.  These 
errors  can  be  simple  omissions  of  details,  simple  loqical  errors , 
or  substantial  omissions  of  pertinent  theory.  Chapter  III  ana 
Chapter  IV  give  a  sufficient  statement  of  CE's  theory;  the  precise 
CE  model  ’$  only  stated  in  the  working  proqram. 

CE  is  coded  in  a  machine  language  called  Sleuth  II  on  the 
Univac  1107  computer  at  Case  Western  Reserve  University  List 
processing  facilities,  (see  Ernst  [5],’.  consist  of  a  number  of 
subroutines  wh’ch  can  be  referenced  f-om  the  Sleuth  II  language. 
Ernst  [5]  shows  how  the  description  lists  of  I  Pi.  and  he  property 
lists  of  LISP  can  be  converted  into  type  lists.  The  objects  of 
Section  2.h  (used  in  CE)  can  easily  be  encoded  into  :he  type 
lists  of  Ernst  [5].  The  reasc  for  c noosing  typed-! 'sfs  to 
represent  data,  besides  being  available,  is  to  combine  qualities 
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of  both  lists  and  description  lists  into  a  single  data  structure. 

The  top  level  organization  of  CE  is  shown  in  Figure  6.  The 
executive  runs  the  experiment  and  as  such  is  a  non  essential  part 
of  the  program.  The  executive  is  described  in  the  first  section 
of  this  chapter.  The  memory  is  described  extensively  in 
Chapter  III  and  will  not  be  further  discussed  in  this  chapter.  In 
Section  5.2  the  storage  and  retrieval  algorithms  are  discussed. 


-37- 


5 • 1  The  Executive 

The  executive  controls  the  flow  of  inputs  to  memory. 
Attribute-value  pairs  on  objects  are  given  to  the  executive.  In¬ 
puts  are  then  generated  by  augmenting  the  given  descriptions  with 
random  noise  at  the  lowest  level  (i.e,  simple  objects).  For  ex¬ 
ample,  a  list  of  relevant  feature  values  (e.g.,  8  out  of  15 
features  are  relevant)  of  a  B  is  used  as  a  description.  Remaining 
feature  values  are  then  added  at  random  to  produce  a  typical  EL 

The  storing  and  retrieving  algorithms  are  techniques  for 
processing  information.  In  CE  the  rote  learning  task  requires  all 
inputs  to  have  a  '+'  sign  while  for  the  concept  learning  task  in¬ 
puts  have  signs  also.  The  executive  has  the  flexibility  of 
monitoring  both  types  of  tasks. 
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5 • 2  Storing  and  Retrieval 

The  CE  program  is  written  to  process  inputs  of  arbitrary  com¬ 
plexity.  The  program  subroutines  must  be  able  to  employ  the  same 
processes  on  data  of  arbitrary  depth  in  their  r'st  structures. 

This  is  accomol ished  by  recursively  calling  the  subroutine  within 
itself  depending  on  the  level  of  the  list  structure. 

The  general  flow  of  control  in  the  CE  program  is  diagramed 
in  Figure  7.  To  understand  the  sequence  of  operations  we  shall 
follow  an  inpu_  through  the  flow  chart.  The  input  is  processed  as 
follows : 

1.  The  first  node  in  the  tree  is  located. 

2.  The  node  may  be  a  test  (non-terminal  node)  in  which  case 
control  is  transfered  to  step  3.  If  the  node  is  a 
terminal  node  control  transfers  to  step  5. 

3  The  node  is  a  test.  The  test  is  executed  on  the  input 
object.  Because  of  algorithm  1  a  question  may  be  asked 
of  the  tree  in  executing  this  test.  This  question,  con¬ 
structed  by  a  test  in  the  tree,  is  called  an  internal 
question  as  opposed  to  an  external  question  asked  by  the 
experimenter.  The  internal  question  is  thrown  back  into 
the  tree  to  step  1,  recursively.  The  yes-no  answer  is 
then  received  and  control  continues  to  step  4. 

4.  The  next  node  is  found  and  control  goes  to  step  2. 

5.  The  node  is  a  terminal  node.  The  input  is  matched  with 
the  terminal  node.  Depending  on  the  sign  of  the  input, 
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Figure  7 


General  CE  Flow  Chart 
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control  may  be  transferee)  to  one  of  the  following 
places: 

a.  If  the  input  is  an  internal  question  then  the 
yes-no  answer  is  given  according  to  the  match 
at  the  node.  Control  is  returned  to  the  re¬ 
cursive  test  in  step  3. 

b.  If  the  input  is  an  external  question,  asked  by 
the  experimenter ,  it  may  be  one  of  two  types. 

If  the  input  is  a  yes-no  question, 

^exemplar.-  cname^  ?,  then  the  output,  is  given 
and  control  exits  to  step  6.  If  the  question  is 
to  retrieve  the  name  of  an  exemplar, 

'■exemp i ar>  ?  «■.  the  name  at  the  terminal  node  is 
thrown  back  into  the  tree  recursively  to  retrieve 
the  individual  name.  The  name  is  then  outputed 
and  control  exits  to  step  6. 

c.  If  the  input  object  is  information  presented  by 
the  experimenter  to  the  tree,  control  passes  to 
the  learning  strategies.  Image  elaboration  or 
tree  modification  (Table  1)  occurs  depending  on 
the  match  at  the  terminal  node. 

6.  The  next  input  is  now  given  to  CE  by  going  to  step  1. 

The  recursive  nature  of  CE  can  be  noted  by  examining  the  flow 
chart.  The  entire  routine  (step  1)  is  called  recursively  by  step  3 
because  of  algorithm  1  and  by  step  5  because  of  image  elaboration. 


Also  note  that  step  5  calls  step  3  in  matching  an  input  to  a 
terminal  node. 


CHAPTER  VI 


THE  EXAMPLE  tASKS 

CE  must  I  earn  some  basic  information  before  attempting  a 
more  difficult  task.  First,  we  describe  the  basic  learning  ex¬ 
periment  and  then  three  more  complex  tasks:  1)  a  concept  learning 
task  similar  to  those  in  Hunt  [8],  2}  a  paired  associate  task,  and 
3)  a  geometry  analogy  problf  (see  Evans  [6]j.  The  latter  task 
was  not  simulated  on  the  computer  due  to  the  additional  programming 
it  required. 

6 . 1  Basic  Information 

The  example  tasks  given  to  CE  involve  processing  of  compound 
objects.  In  order  to  learn  concepts  about  compound  objects  (in  a 
reasonable  way),  certain  elementary  concepts  involving  the  sub¬ 
parts  of  the  compound  objects  should  be  learned  first.  Thus,  to 
learn  concepts  about  letter  strings  and  to  use  letter  strings  the 
alphabet  should  be  learned.  Step  a,  given  below,  is  to  present  CE 
with  written  letters  as  exemplars  and  their  names  which  are  atomic 
symbols.  The  written  letters  are  simple  objects.  A  typical  list 
of  inputs  of  this  type  is: 
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Step  a: 

' simple  object- 

•atomic  symbols' 

-  sign 

A 

‘A’ 

+ 

A 

'A' 

+ 

b 

'A' 

- 

A 

'A' 

+ 

B 

'A' 

- 

A 

'A' 

+ 

The  encoding  of  two  simple  objects  in  general  will  be  different. 
Thus,  the  A's  will  all  be  slightly  different.  However,  any  two  A's 
should  be  more  similar  than  an  a  and  an  A.  The  effect  of  this 
learning  is  to  discover  the  key  features  (attributes)  of  the  dif¬ 
ferent  letters.  Key  features  depend  not  only  on  the  encoding  of 
the  different  a's  but  also  on  how  they  differ  from  the  encodings  of 
the  other  letters. 

All  simple  objects  have  15  basic  features.  Seven  of  these 
features  are  relevant  for  all  letters  and  the  other  eight  are 
noise  features.  Thus,  typical  instances  of  a  B  are  generated  by 
providing  8  random  feature  values.  The  criteria  for  when  a  con¬ 
cept  has  been  learned  is  that  one  complete  cycle  through  the  list 
produces  correct  classification  for  each  element  on  the  list. 

In  the  first  step  30  Members  and  non-n  mbers  of  the  concept 
of  a  capital  roman  A  was  given.  This  list  was  cyclec  through 
twice  to  complete  learning.  Of  the  30  A's  yiven,  24  were  positive 
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instances  of  the  concept  'A'  and  6  were  negative  instances.  Table 
2  summarizes  the  number  of  simple  objects  presented  to  CE  in  order 
to  learn  the  alphabet, 

''able  2  shows  that  letters  which  occured  early  in  the  position 
of  learning  were  hardest  to  learn.  This  occured  because  later 
letters  had  a  di scrimination  net  to  pick  out  the  relevant  feature^ 
of  inputs,  hence,  learning  was  faster.  All  inputs  of  Table  2  were 
given  to  CE  and  stored  m  memory  in  30  seconds  on  the  Uni  vac  1107 
computer. 

Letters  are  in  more  than  one  concept.  For  example,  the 
written  letter  A  is  an  'A',  a  capital  letter,  a  vowel,  etc.  Thus, 
step  b  gives  innuts  to  CE  which  have  letters  as  exemplars  and  a 
string  of  letters  as  a  name.  For  example,  in  the  in..ut  A  CAP  +, 
CAP  is  a  compound  object  naming  the  concept  of  'capita!"  letters. 

In  step  b,  the  concept  vowel  (VOW),  capital  letter  (CAP),  and 


small  letter  (SML)  are  learned. 

Step  b:* 

<simple  objects  name-  sign 

a  VOW  + 

a  VOW  + 

A  CAP  + 

A  VOW  ♦ 

a  SML  + 


★ 

We  would  like  to  present  'A'  VOW  ♦  as  an  input,  but  ’A'  is  a 
print  name  and  cannot  he  used  as  an  exemplar. 
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exempler 

•  name 

number 
of  in¬ 
puts  on 
a  list 

number 
of  '  +  ' 
i nstance 

number 
of  1 
instance 

number  of 
cycles 
through  tl 
list 

A 

'A' 

30 

24 

6 

2 

a 

'A' 

jC 

24 

6 

2 

B 

’  B  1 

3u 

24 

6 

2 

b 

1 B ' 

30 

24 

6 

0 

c 

'  F  1 

2o 

16 

4 

2 

f 

'  F  ' 

30 

16 

4 

2 

J 

'J' 

o 

CM 

16 

4 

2 

K 

'  K 1 

20 

1c 

4 

2 

X 

X' 

10 

8 

2 

2 

V 

'  Y  ' 

10 

.5 

2 

2 

z 

'Z' 

10 

8 

O 

c 

2 

Fa!  : c  2 

A  partial  iist  of  ini  ‘a  yiven  to  Ci  in  step  a.  The  inputs  appear 
m  order  of  their  position  in  I e a r r.  1  na . 
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Since  the  alphabet  has  already  been  learned  by  CE,  much  of 
the  time  spent  in  step  b  was  in  learning  the  names.  Table  3 
summarizes  the  inputs  given  to  CE  xjr  step  b.  The  inputs  of 
Table  3  were  learned  by  CE  in  20  seconds  on  tne  Imvac  1107 
computer. 

A  snapshop  picture  of  the  CE  tree  after  1  naming  the  basic 
information  ’s  shown  in  Figure  8.  Notice  th-  .  names  of  concepts 
are  always  thrown  back  into  the  tree  and  learne  ,  unless  the  name 
is  a  print  name.  For  example,  CAP  is  in  the  tree  as  a  letter 
triplet  not  as  an  exemplar-name  pair.  Notice  also  that  tests  in 
Figure  8  may  have  three  branches.  When  the  feature  of  a  test 
does  not  designate  a  subpart  ot  the  input  the  test  is  undefined. 
Hence,  a  third  branch,  which  discriminates  inputs  with  different 
descriptions,  may  exist.  The  example  tasks  which  use  this  basic 
information  are  given  next. 


eyeniD  1  er 


name 


eniplei 

r  name 

number 
of  in¬ 
puts  on 
a  list 

number 
of  '  + 
i nstance 

number 
of  ' 
i  nstance 

number  of 
cycles 
through  the 
list 

A 

VO  A 

1  r 

1  v 

8 

2 

1 

A 

CAP 

1C 

8 

2 

1 

a 

SMI 

1C 

8 

9 

L 

1 

b 

SML 

10 

8 

l 

1 

q 

CAP 

10 

d 

2 

1 

j 

CAP 

10 

8 

2 

1 

K 

:ap 

1C 

8 

9 

1 

Y 

(  d 

V.- 

10 

8 

o 

L 

1 

2 

:  hF 

10 

6 

2 

1 

Table  3 

A  partial  list  i.  f  inputs  given  to  CE  in  step  b. 

f 

i 
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6.2  Concept  Learning  Task 

A  list  of  letter  strings  are  given  to  CE.  The  exemplars  are 
'typical'  instances  of  the  disjunctive  concept.  Cl,  of  those  three 
letter  words  whose  third  letter  is  a  vowel  or  whose  second  letter 
is  a  C  or  both,  /he  growth  of  the  CE  tree  is  shown  for  each  input 
in  sequence,  using  the  learning  strategy  of  Table  1.  The  inputs 
are  as  follows: 


<exeinplar> 

<name> 

<sign> 

1. 

ABE 

Cl 

+ 

2. 

GCA 

Cl 

+ 

3. 

CEB 

Cl 

- 

4. 

BCA 

Cl 

+ 

5. 

GDF 

Cl 

- 

6. 

DCB 

Cl 

+ 

7. 

AGB 

Cl 

- 

The  actions  of  the  corresponding  (serial)  inputs  are  given  in 
order.  The  actions  are  numbered  according  to  the  number  of  the 
corresponding  input. 

1.  The  initial  image  due  to  input  1  is  from  throwing  the 
first  letter  (i.e.,  A  ?  +)  into  the  tree  and  retrieving 
all  names.  The  name.  Cl,  is  also  thrown  into  the  CE 
tree  and  learned.  The  initial  terminal  node  is  now: 
{((ex  1st)  c  (CAP)'),  ((ex  1st)  e  'A'),  ((ex  1st)  e  (VOW)'), 

((na)  =  (Cl)')). 
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2.  The  tests  ((ex  1st)  e  'A')  and  ((ex  1st)  e  (VOW)1)  are 
made  X-irrelevant  (See  Section  4.2). 

3.  All  tests  are  true  on  the  input;  ((ex  1st)  e  (CAP)')  is 
made  Y-irre  ant. 

4.  Since  there  are  no  longer  any  relevant  images  at  the 
node,  CE  decides  to  elaborate  the  image  by  adding  the 
new  tests;  ((ex  2nd)  e  'C'),  ((ex  2nd)  e  (CAP)'). 

5.  The  test  ((ex  2nd)  e  (CAP)')  is  made  Y-irrelevant. 

6.  Nothing  is  done  to  the  node. 

7.  Nothing  is  done  to  the  node. 

We  now  recycle  the  list  of  inputs  by  presenting  the  same  inputs 
to  CE  again. 

1.  The  test  ((ex  2nd)  e  ' C ' )  is  made  X-irrelevant.  The 
node  is  now  elaborated  by  adding  three  new  tests; 

((ex  3rd)  e  *E' ).  ((ex  3rd)  e  (CAP)')  and 

((ex  3rd)  e  v /OW) ' ). 

2.  The  test  {(ex  3rd)  t  ' E 1 )  is  made  X-irrelevant. 

3.  Nothing  is  done  to  the  node. 

4.  Nothing  is  done  to  the  node. 

5.  Nothing  is  done  to  the  node. 

6.  The  test  ((ex  3rd)  e  (VOW)')  is  made  X-irrelevant.  The 

node  is  now: 

{((ex  1st)  r  'A'  X),  ((ex  2nd)  e  'C'  X),  ((ex  3rd)  E  (VOW)'  X), 
((ex  1st)  e  (CAP)'  Y).  ((ex  2nd)  e  (CAP)’  Y),  {(ex  3rd)  c  ' E '  X), 
((ex  1st)  e  (VOW)'  X),  ({ex  3rd)  c  (CAP)'  Y)> 


-51- 


7.  The  test  ((ex  3rd)  f  (CAP)')  is  .ade  Y-irrelevant. 

Until  now  CE  has  tried  to  describe  the  concept,  Cl,  con¬ 
junctively  and  has  failed.  Consequently,  CE  tries  to  describe 
the  concept  disjunctively  by  growing  the  tree.  CE  must  be 
careful  on  what  test  to  grow  the  tree.  One  of  the  tests  marked 
X-irrelevant  will  be  used  as  a  new  non-terminal  node.  The 
criteria  is  at  least  two  positive  inputs  must  pass  the  test.  In 
this  way  CE  tries  to  form  a  conjunctive  concept  of  a  subset  of 
the  positive  exemplars.  Since  some  positive  inputs  pass  and  some 
fail  the  test,  the  test  acts  to  discriminate  between  conjunctive 
subparts  of  a  disjunctive  concept.  This  criteria  for  growing  the 
net  is  not  shown  in  Table  1  because  it  is  best  illustrated  with 
an  example.  The  list  of  inputs  is  now  recycled  again. 

1.  The  tests  ((ex  1st)  e  'A'),  ((ex  1st)  e  (VOW)'), 

((ex  3rd)  f  ' E ' ) ,  ((ex  3rd)  e  (VOW)1)  are  noted  as 
relevant  to  input  1. 

2.  The  test  ((ex  3rd)  e  (VOW)')  is  noted  as  relevant  to 
two  positive  inputs.  Hence,  by  the  growing  criteria 
(above)  the  test  ((ex  3rd)  f.  (VOW)')  is  the  new  net 
discrimination  test.  The  subtree  now  looks  like: 
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The  initial  image  of  input  2  is  taken  to  be  those  tests 
marked  X-irrelevant  that  are  passed  by  input  2. 

For  all  remaining  inputs,  nothing  is  done  to  modify  the  tree; 
the  concept  has  been  learned.  The  Appendix  shows  a  computer 
printout  of  the  learned  concept,  i.e.,  a  subtree  of  the  trtal 
memory. 

Let  us  give  CE  a  new  list  of  letter  strings.  The  exemplars 
of  this  new  list  are  instances  of  the  conjunctive  concept,  C2,  of 
those  three  letter  words  whose  first  letter  is  a  B  and  whose  third 
letter  is  an  A. 

The  inputs  are  as  follows: 


<exemplar> 

<name> 

<sign> 

1. 

BOA 

C2 

+ 

2. 

BAE 

C2 

- 

3. 

BEA 

C2 

+ 

4. 

CDB 

C2 

- 

i 

sj  • 

ABA 

C2 

- 

1. 

The  initial 

input  sorts  to  the  left  node 

because  the 

test  on  the  first  letter  is  true.  Since  the  exemplar 
matches  the  node,  a  test  on  the  name  is  the  new  net 
discrimination.  The  tree  is  now: 


•  ■  — — -  *—***«+.■  ..  „  *  ^  „ 


(Ux  3ra )  £  (VOW)  ■ 


( (na)  -  (Cl)') 


((ex  2nd)  e  'C 
((na)  =  (Cl)1) 


3.  Nothing  is  done  to  the  tree. 

4.  Nothing  is  done  to  the  tree. 

5.  The  input  matches  node  N2.  Any  test  marked  Y-irrelevant 


that  is  false  on  this  input  is  made  relevant.  Thus  the 
test  ((ex  1st)  t  ' B 1 )  is  made  relevant  by  erasing  the 
Y  sign. 

The  list  is  recycled  again.  All  inputs  are  satisfied,  hence,  the 
concept  has  been  ^arncJ.  Node  N2  is  now: 

{((ex  1st)  e  *B* ),  ((ex  1st)  e  (CAP)'  Y),  ((ex  2nd)  e  'E'  X), 

((ex  2nd)  e  (VOW)’  X),  ((ex  2nd)  e  (CAP)'  Y),  ((ex  3rd)  e  'A'), 
((ex  3rd)  e  (CAP)'  Y)}. 
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6 . 3  Paired  Associative  Task 

Pairs  of  letter  strings  are  presented  to  CE.  Hunt  [8]  points 
out  that  this  task  can  be  viewed  as  a  pair  associative  task  where 


exemplars  are  stimulus  and  names 

are  responses. 

<exemplar  or  stimuli > 

<name  or  response> 

<sign> 

JAD 

BIL 

+ 

TOD 

BER 

+ 

KEP 

NIS 

+ 

DAT 

REB 

+ 

The  first  input,  JAD  BIL  +  ,  forms  an  initial  focus  image  by  throwing 
an  examplar  subpart  into  the  tree,  i.e.,  J  ?  +,  and  retrieving  some 


names.  The  terminal  node  is: 

{((ex  1st)  e  -J' ),  ((ex  1st)  e  (CAP)'),  ( ( na )  =  (BIL)')). 
Again,  the  name,  i.e.,  BIL,  is  learned.  The  next  input,  TOD  BER  *, 
is  given  to  CE  and  a  difference  is  noted  in  the  exemplar,  hence, 
tree  modification  take:  Diace  as  shown  below: 


The  next  input,  KEP  NIS  +,  is  sorted  to  the  right  node  (above) 
and  a  difference  is  detected.  The  tree  is  grown  using  the  test 
((ex  1st)  e  * T * )  as  the  new  net  discrimination.  The  next  input, 

DAT  REB  + ,  sorts  to  the  right  node  and  a  difference  is  detected. 
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The  final  tree  is  shown  in  Figure  9. 

Viewing  this  as  a  pair  associative  task,  CE  will  produce 
responses  by  asking  questions  like  JAD  ?  +.  Although,  in  general, 
such  a  question  may  have  many  answers  (due  to  overlapping  concepts), 
in  a  paired  associative  task  each  such  question  has  a  unique  answer 
because  each  stimulus  has  a  unique  response. 
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6.4  Geometry  Analogy  Task 

Until  this  point  the  examples  have  bean  run  on  the  Univac  1107 
computer.  The  geometry  analogy  task  has  not  been  programmed 
because  of  the  separate  executive  needed  for  learning  relational 
concepts  and  algorithm  2.  CE  does  not  knew  when  to  stop  describing 
concepts  conjunctively  and  disjunctively  and  to  start  describing 
concepts  as  relations. 

For  the  geormetry  analogy  task,  CE  should  first  learn  the 
subparts  of  the  analogy  pictures,  i.e.,  the  simple  geometric 
figures.  Simple  geometric  figures  are  encoded  as  simple  objects. 
Names  of  the  simple  objects  are  letter  strings  as  shown  below. 

In  step  a,  the  concept  triangle  (TR),  square  (SQ),  rectangle  (RECT), 
etc.,  are  learned.  The  simple  geometric  figures  were  not  needed 
for  the  CE  program  because  the  geometry  analogy  task  was  not  imDle- 
mented.  Sample  inputs  are  given  in  step  a. 

Step  a: 

<simp1e  object*  <name>  <sign> 


□ 

□ 

a 

a 

o 


sq 

sq 

rect 

sq 

sq 


As  an  example  of  learnir.g  more  complex 


♦ 

+ 

♦ 


objects,  we  shall 


$ 
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consider  the  geometry  analogy  task*,  as  typified  by  the  one  in 
Figure  10.  The  problem  is  "picture  1  is  to  picture  2  as  picture  3 
is  to  which  of  the  following  pictures."  The  answer,  according  to 
most  people,  is  picture  6.  The  geometry  analogy  of  this  problem 
is  that  'the  shape  of  the  outer  most  figure  in  the  left  diagram 
is  the  Some  as  the  shape  of  the  figure  in  the  right  diagram." 
Notice  the  two  figures  in  the  diagram  are  not  identical;  we 
cannot  say  .  The  two  figures  are  the  same  because  both 

have  the  same  name  (square)  with  respect  to  a  property  (shape). 
Thus,  before  CE  can  learn  analogy  problems  it  must  have  acquired 
the  concept  of  shape.  Since  names  are  stored  in  CE  as  concepts, 
instances  that  were  previously  only  names  can  be  used  as  exemplars 
of  new  concepts.  To  this  end,  CE  is  told  that  a  square  is  in  the 
concept  shape;  a  square  is  a  four  sided  figure  (FSF);  etc.  These 
inputs  were  not  needed  tor  the  CE  program  because  the  analogy  task 
was  not  given  to  CE.  Tne  new  executive  routine  needed  far  the 
analogy  tas4,  is  explained  in  this  section.  Step  b  for  the 
geometry  analogy  task  is  now  given. 


* 

The  analogy  problem  is  the  same  as  considered  by  Evans  [6]. 
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1 

□ 


2 


® 

x‘  ®  A  ©  O  □ 

4  5  6  7  8 


Figure  10 

A  sample  geometry  analogy  problem  from  Evans  [6].  The  problem  is 
"1  is  to  2  as  3  is  to  X."  The  answer  is  X  =  #7. 
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Step  b: 

exemplar 

name  - 

<sign> 

SQ 

SHAPE 

+ 

sq 

shape 

+ 

sq 

FSF 

+ 

rect 

shape 

Rect 

fsf 

+ 

The  CE  subtree  after  learning  the  inputs  of  step  a  is  illustrated 
in  Figure  11.  Note  that  we  have  introduced  another  level  in  the 
hierarchy  of  concepts.  That  is  0  •  sq  and  so  t  shape;  however, 

0  i  shape.  In  general,  there  will  be  other  concepts  at  the 
same  "level"  as  shape,  e.g.,  four  sided  figures  cf  which  sc[  is 
also  a  member. 

A  difficulty  introduced  by  the  geometry  analogy  task  is  the 
need  of  "relational"  concepts.  That  is.  a  test  like  "first  is 
the  same  as  the  second."  The  basic  idea  is  contained  in  an 
"extended"  question  as  0  ?  +  (SHAPE)'.  Since  the  figure 
has  several  names,  the  input  will  sort  to  a  test  on  names,  say 
( ( na )  =  (SQ)').  To  answer  this  test  we  form  the  question 
(SQ)'  (SHAPE)'  ?  and  throw  it  'nto  the  tree.  If  the  answer  is 
negative,  then  the  test  ((na)  =  (SQ)')  is  false. 

The  description  of  the  new  test  is  needed  for  analogy  problems 
follows.  The  format  of  the  test  Is  S(< feature^*  ‘featu^-’  <name>). 
The  logical  interpretation  of  the  test  is:  "There  exists  a  name 


Q')) 


{ex((lSt  e  'S'),  (2nd  e 
na(lst  e  1 S - ) } 
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y  such  that  two  exemplar  subparts,  *  feature, -feature2>,  are 
members  of  y  and  y  is  a  member  of  'name-.'  The  3  stands  for  the 
predicate,  similarity. 

The  test  can  be  applied  to  the  input 

m  ■■  □ 

The  encoding  of  the  exemplar  name  pair  is  given  in  Figure  12.  Thus, 
the  test  S((ex  left  inside)(ex  right)! SHAPE ) ' )  asks  whether  the  sub¬ 
part  of  the  exemplar  on  the  left-inside,  i.e.,  D  ,  is  th  same 
as  the  exemplar  on  the  right,  i.e.,  □  ,  wth  respect  to  shape. 

Algorithm  2  for  the  evaluation  of  a  similarity  test  is  now 
given.  The  test  S(- feature,-  feature2  name^)  operating  on  an  in¬ 
put  is  interpreted  as  there  exists  a  n,  such  that  e,  n ,  ♦  , 
e2  n,  ♦  ,  and  n-(  n2  +  where  e,  and  e2  are  exemplar  subparts  desig¬ 
nated  by  feature,  and  'feature^  and  n2  is  -name'.  To  find  if 
there  exists  a  n,,  we  ask  the  questions  e,  ?  +  and  e2  ?  +.  This 
implies  that  non-terminal  nodes  like  ((na)  -  n)  will  be  encountered. 
The  test  ((na)  -  n)  will  be  true  or  false  depending  on  the  truth  of 
n  n2  ?.  In  this  way  the  inputs  e,  ?  *  and  e2  ?  +  will  go  to 
unique  terminal  nodes,  and  the  names  will  be  retrieved.  The 
similarity  test  is  true  if  the  two  names  are  identical. 

As  an  example  of  algorithm  2,  consider  the  input, 

H  :  □  05  *• 

and  the  test,  5((ex  left  inside)(ex  right) (SHAPF V ) .  The  first 
feature,  (ex  left  inside),  designates  the  subpart  □  ,  and  the 


0  '  □ 


C5  + 


{ex( ( 1  eft  S$)  (right  s) 
na((1St  C)  (2nd  5))} 

S$  *  ((Outside  S)  (Inside  s)) 
s  =  ((fl  VI) . (fl 5  V2) ) 


Figure  12 

A  geometry  analogy  input  and  its  encoding. 


-65- 


second  feature,  (ex  right),  designates  u  •  The  test  becomes 
S(  Q  ,  [  j  .  (SHAPE)').  To  answer  this  test  the  question 
CD  ?  +  i s  asked.  It  goes  to  a  non-terminal  node  with  the  test, 
((na)  =  (SQ}‘),  (See  Figure  8),  To  answer  the  test  we  throw 
SQ '  SHAPE'  ?  into  the  tree  and  get  the  answer  +.  Thus, 

((na)  =  (SQ) ' )  is  true. 

Note  that  the.  name  retrieved  in  the  above  example  may  not  be 
desirable.  Suppose  the  question  □  ?  +  (SHAPE)'  reaches  a  non¬ 
terminal  node  with  the  test,  ((na)  =  (RECT ) ' ) .  Since  the  question, 
RECT ’  SHAPE'  ?  is  true,  the  name,  RECT,  would  be  retrieved  instead 
of  SQ.  However,  all  squares  are  rectangles. 

An  example  task  of  compound  (analogy)  figures  and  names  is  now 
given.  The  executive  which  controls  the  experiment  is  explained 
at  the  end  of  the  example.  The  task  is  for  CE  to  choose  which  in¬ 
put  is  most  nearly  the  same  as  the  first  input. 


exemplar 

name  ■ 

<sign 

C5 

+ 

2.  @  © 

C5 

? 

3.  ©  :  & 

C5 

? 

4.  ©  :  © 

C5 

? 

5  ©  t  0 

C5 

? 

6.  ©  :  Q 

C5 

? 

CE  will  choose  input 

5  to  be  true  and  all 

other  inputs 

(except  1)  to  be  false;  thus,  5  is  the  answer. 

Input  1  is  sorted  to  a  new  node  and  a  focus  imag^  is  formed. 
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Thus,  the  exemplar  subpart  (□)  is  thrown  into  the  tree  ( i . e . , 

El  ?  +)»  but  ores  not  match  a  node.  Then  the  sub-subpart  is 
thrown  into  the  tree,  (i.e.,  □  ?  +).  The  name,  SQ *  is 
retrieved.  We  then  throw  the  name  back,  (i.e.,  SQ'  ?  +)  and 
retrieve  SHAPE'.  A  match  of  the  subparts  of  the  exemplar  yields 
tha  following  terminal  node: 

{ S ( (ex  left  outside)(ex  rigi.t)  SHAPE1),' 

S((ex  left  inside)(ex  right)  SHAPE'), 

S((ex  left  outside)(ex  left  inside)  SHAPE1), 

((na)  =  (C5) ' ) } 

It  complete  the  analogy  task  the  executive  gives  the  remaining  in¬ 
puts  (2  through  6)  to  CE.  If  none  of  the  inputs  match  the  terminal 
node,  CE  will  employ  Table  1  to  modify  the  image  until  exactly  one 
input  (2  through  6)  matches  the  node.  Thus,  the  second  and  third 
tests  are  marked  irrelevant  and  input  5  matches  .he  image. 


CHAPTER  VII 


CONCLUSIONS  AND  OBSERVATIONS 

The  main  feature  of  this  research  is  the  learning  of 
hierarchial  concepts,  i.e.,  concepts  in  terms  of  other  concepts. 

The  concepts  can  be  either  conjunctive  or  disjunctive.  CE  can 
gem  -aWe  upon  the  instances  given  to  it.  In  addition,  it  can 
correct  errors  made  in  generalizing.  To  the  best  of  our  knowledge 
no  other  computer  program  can  learn  hierarchial  concepts,  mainly 
because  of  the  intrinsic  dynamics  of  the  problem. 

Learning  concepts  of  concepts  is  closely  related  to  the  dif¬ 
ficult  problem  of  feature  extraction  described  in  Chapter  2.  The 
te^ts  of  CE  can  be  viewed  as  Boolean  valued  features.  For  example, 
for  any  object  the  feature,  (ex  lSt)  t  'A',  is  either  true  (has  a 
value  1)  or  false  (has  a  value  0).  The  problem  of  learning  a  con¬ 
cept  is  the  problem  of  finding  the  relevant  features.  In  CE  this 
involves  finding  the  concepts  possessed  by  the  subparts  of  an 
exemplar.  In  addition  tnese  features  once  "extracted"  must  be  com¬ 
bined  with  logical  connectives  (m  CE  conjunction  and  dis¬ 
junction*).  CE  does  not  consider  ^11  possible  concepts  in 
"extracting"  features;  it  only  considers  all  those  concepts  that  it 

has  previously  learned.  This  has  the  good  quality  that  concept 

* 

Negation  is  strange:  in  order  to  learn  with  any  level  of  con¬ 
fidence,  the  concept  (jx  1st)  /  'A',  at  least  25  inputs  must  be 
given. 
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learning  Improves  with  time  since  the  repertoire  of  previously 
learned  concepts  is  increasing. 

Unfortunately,  the  previously  learned  concepts  come  from  some 
concept  learning  task  given  to  CE.  If  CE  were  to  do  genuine  feature 
extraction,  it  would  have  to  generate,  internally,  concept  learning 
tasks  as  subproblems  for  some  given  task.  That  is,  concept  learning 
tasks  must  be  given  to  CE  in  a  logical  order.*  Thus,  it  is  that 
first  graders  are  taught  the  alphabet  before  they  are  taught  to 


read. 

To  our  knowledge  the  only  work  that  does  feature  extraction  is 
ohr  and  Vossler  p6].  However,  the  features  of  CE  are  more  general 
than  the  templets  used  in  Uhr  and  Vossler  [16 j.  The  templets  are 
fixed  in  size  and  are  inherently  conjunctive.  In  CE  a  feature 
pertains  to  any  concept,  either  conjunctive,  disjunctive  or 
hierarchlal,  e.g.,  the  disjunct’/e  concept  of  all  as. 

In  CE  the  basis  for  learning  is  the  question  ^exemplar*  ?  +. 

The  answer  to  this  question,  <name>  ,  must  be  retrieved  from  the  CE 
memory.  This  process  is  how  CE  differs  from  other  work.  The 
implied  organization**  of  other  concept  learners  (e.g.,  Banerji  [1], 
Hunt  [8],  Sebestyen  [13])  is  to  look  up  the  name  of  a  concept  in  a 
table  to  find  a  pointer  to  the  description  of  the  concept.  Thus,  to 

- j - 

CE  will  learn  any  concept;  however,  if  the  correct  subconcepts 
are  not  learned  then  learning  will  be  difficult. 

** 

The  organization  is  not  stated  because  they  do  not  have  in 
mind  a  memory  full  of  concepts.  It  appears  to  us  that  the  table 
look  up  method  is  indicated. 


<£**> 
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answer  the  question  <exemplar>  ?  +  such  a  program  must  start  at  the 
head  of  a  table  and  apply  the  description  of  the  first  concept  to 
the  exemplar,  then  the  description  of  the  second  concept,  etc. 

Since  each  object  may  be  in  several  concepts,  the  search  cannot 
stop  with  a  match,  but  must  go  through  the  entire  table.  In  CE  an 
input  is  sorted  on  the  exemplar  to  a  particular  terminal,  and  the 
name  is  retrieved.  If  an  exemplar  is  contained  in  several  concepts, 
then  a  test  on  name  occurs  and  a  limited  search  is  necessary. 
However,  never  does  CE  search  the  entire  memory.  Thus,  the  memory 
organization  of  CE  is  its  main  advantage  because  it  answers 
exemplar  ?  +  with  only  a  limited  search. 

One  of  the  primary  design  criteria  is  that  questions  of  the 
form,  exemplar  name>  ?,  can  be  answered  without  search. 

Algorithm  1  allows  CE  to  do  this.  However,  the  answer  may  be 
wrong  if  the  concept  was  learned  incorrectly,  and  CE  will  not 
answer  at  all  if  it  has  not  learned  sufficient  information. 

The  geometry  analogy  task  at  the  end  of  Chapter  VI  requires 
relational  concepts.  Certainly  any  good  concept  learner  must  have 
this  capability.  The  program  that  was  written  learns  concepts  in 
terms  of  conjunctive  and  disjunctive  concepts.  Algorithm  2  is  an 
attempt  to  retrieve  relational  concepts  without  search.  However, 
the  whole  issue  of  relational  concepts  needs  more  investigation. 

The  basic  difficulty  is  that  identity  is  meaningless.  The 
Statement ,  □  -  □  ,  does  not  mean  that  they  are  identical  but 

that  they  both  possess  the  key  features  of  a  square.  The 


complication  is  that  they  are  rectangles  and  four  sided  figures 
also. 


Another  difficulty  is  that  CE  cannot  describe  general 
relational  concepts  although  CE  can  learn  relations.  This  dif¬ 
ficulty  can  be  removed  by  the  technique  of  Banerji  [1].  The  set 
membership  test  can  be  extended  to  form  a  test  that  involves 
several  subparts  of  an  object,  e.g,  ((ex  1st)  (ex  3rc*))  £  NEXT'. 
For  example,  BFC  would  pass  the  test  because  C  is  next  to  B  in  the 
alphabet.  However,  the  learning  of  conceDts  involving  arbitrary 
relations  is  a  difficult  problem- 


APPENDIX  A 


COMPUTER  INPUT  AND  OUTPUT 

Figure  13  shows  some  inputs  given  to  the  executive.  The 
relevant  attributes  of  simple  objects  are  given.  The  list  of  in¬ 
puts  for  a  particular  experiment  is  illustrated  on  the  list 
labeled  DI.  The  executive  adds  random  feature  values  to  the 
description  of  an  input  to  produce  a  "typical 11  input  shown  in 
Figure  14. 

The  concept  CLI,  corresponding  to  the  concept  Cl  of  Section 
6.2,  is  shown  as  a  subtree  of  CE  in  Figure  15.  The  encoded  output 
is  given  to  make  the  i nterpretation  easier. 
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A  LST 

AP=(LEVFLi  (F2  V1XF3  V2XF4  ViXFb  ViXFb  V1XF7  V2XF8  VD  ) 

BP=<LEVFLl  (F2  V1XF3  V2XF4  VD  (F5  VI)  (Fb  V*)  (F7  VIXFB  VD) 

CP=(LEVFLl  (F2  V1XF3  V1XF4  V2)(Fb  ViXFb  VlXF?  V2XF8  V2X 

AQ=(LEVFLl  (F2  VI)  IF.?  vl)(F4  V2)(Fb  V2)  (Fb  V1XF7  V2XF8  VI)) 

BQ=(LEVFL1  <F2  ViXFb  V1HF4  V2XFS  ViXFb  Vi)(F7  V2)  ( F 8  VD) 

CQ=(L£VFLl  (F2  VI )  l  Fb  Vl)(F4  V2XFS  V1XF6  VD  (F7  V2)  (rd  vl)) 
<»Q=«L£VFLl  (F2  V2XFb  VD  (F4  Vl)(Fb  V1XF6  V2XF7  VlXrS  V*)) 

ttt“  ( LEVFLl  CF2  VI  XFb  V2XF4  viXFb  ViXFb  V2XF7  v!XF8  VD) 

FQ=(LEVFLi  (F2  V2HF3  V2Xd»  ViXFb  ViXFb  V2XF7  V2XF6  VD  ) 

OQ=(LEVFLl  (F2  V2XF3  Vl)<F4  ViXFb  VI )  (Fb  VD  (F7  V2XF6  VI )  ) 

PQ=(LEVFL1  (F2  VEXFb  V?)(F4  VI )  (Fb  V2XF6  VD  (F7  vl)<F8  VD) 

HQ- (LEVFLl  (F2  VD  (Fb  V2 ) ( F4  VD  (Fb  V2)(F6  VD(F7  V1MF8  V2>) 

IG=(LEVFLi  ( F2  V2XF3  V2X  F*  V2)<FS  ViXFb  V1XF7  VlXFtt  v 2)) 
JQ=<LEVFLl  ( F2  iXFb  V1XF4  VDlFb  V2)(F6  VD  (F7  VIXFB  VD  ) 
LO”  ( LEVFLl  ( F2  ViXFb  V2>(e4  Vi)  (F5  V2)(Fo  V2XF?  V2)(Fb  VD) 
UQ=(LEVFL1  (F2  VDlFb  V2)  F4  VDlFb  V2)(F6  VD  (F7  Vl)(F8  VD  ) 
VQ=(LEVFLl  (F2  VDIFJ  V1XF4  VD  (Fb  V2  )  (Fb  V2HF7  VD  (F8  VD) 

•(Q-  (LEVFLl  (F2  VI)  (Fb  V2)lF’:  Vl)(Fb  V  2  )  ( F  6  VD  (F7  V2XF8  VD  ) 

XQ=(LEVFLl  ( c2  ViXFb  VlX  F4  V2XFS  VDlFb  V2)(F7  VD(F8  V2)  ) 
^Q=  ( LEVFLl  (F2  VD(F3  VD(F4  V’  >  (F5  V2)  (F6  VD  IF7  V2)(F8  V2X 
LAF=(LEVLL2  (FIRST  C«) (SECOND  AQ) ( THREE  PQ)) 

V0*=(LEVtL2  (FIRST  VW)  (SECOND  LQX  THREE  *Q)  ) 

LLl=(LEvtL2  (FIRST  CUXSECONd  LQ)  ( THREE  IQ)  ) 

01-  ( 

(Ext  (EXM  ABE)  (NAM  CLD  (SION  ♦)) 

(Ext (exm  gcexnam  cldisign  +)) 

(EXT  (EXM  CEBXNAM  CUXSIGN  -)) 

(EXT(£XM  bCAXNAM  CLlXbloN  ♦)) 

(EXT (EXM  DDF)  (NAM  CLIXSIGN  -)) 

(EXT  (EXM  DCbXNAM  CLIXSIGN  ♦)) 

(EXT (EXM  AGBXNAM  CLI)(SIGN  -)) 

) 

ABE=(LEVtL2  (FIRST  AU) (SECOND  BQ) (THREE  EQ) ) 

QCE=(LEPEL2  (FIRST  GU) (SECOND  LQ) (THREE  EQ) ) 

CEB=(L£VEL2  (FIRST  CU) (SECOND  EQ) (THREE  BQ)) 

BCA=(LEVEL2  (FIRST  BQ) (SECOND  LQ) (THREE  AQ) ) 

QDFs(LEVEL2  (FIRST  GU) (SECOND  DQ ) ( THREE  FQ) ) 

0CB=(LEVtL2  (FIRST  DU) (SECOND  LQ) (THREE  BQ) ) 

AGB=(L£VEL2  (FIRST  AU) (SECOND  bQ) ( THREE  BQ) ) 

A  FIN 


Figure  13 
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A  learning  program  designated  CE,  Concept-EPAK,  is  described  that  modifies 
EPAM  through  the  introduction  of  a  S8t  membership  relation.  The  effects  of 
this  extension  are  considered  with  respect  to  methods  of  storing  concept 
descriptions  in  memory  and  methods  of  specifying  learning  and  retrieval. 

The  learning  strategies  consist  of  interactions  between  image  elaboration  and 
tree  modification.  Implementations  ot  CE  are  considered  for  a  concept 
learning  task  and  a  pair  associate  task.  Applicability  of  CE  to  a  geometry 
analogy  task  requiring  relational  concepts  is  discussed.  The  relationship 
between  the  learning  of  concepts  of  concepts  and  feature  extraction  is 
illustrated. 
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